Abstract

An explicit algorithm for performing Schumacher's noiseless compression of
quantum bits is given. This algorithm is based on a combinatorial expression
for a particular bijection among binary strings. The algorithm, which adheres
to the rules of reversible programming, is expressed in a high-level
pseudocode language. It is implemented using $O(n^3)$ two- and three-bit
primitive reversible operations, where $n$ is the length of the qubit strings
to be compressed. The algorithm also makes use of $O(n)$ auxiliary qubits;
however, space-saving techniques based on those proposed by Bennett are
developed which reduce this workspace to $O(\sqrt{n})$ while increasing the
running time by less than a factor of two.

I. INTRODUCTION



There is considerable interest in the controlled generation, manipulation and
transportation of individual quantum states; applications of such resources are
envisioned in new kinds of data transmission, cryptography and computation.
Qubits, the quantum extension of conventional bits, have been subject to
considerable exploration lately. A single qubit is embodied in the state of a
single two-state quantum system, such as the spin degree of freedom of an
electron or other spin-1/2 particle, where the spin-up state of the particle is
denoted by $|0\rangle$ and the spin-down state is denoted by $|1\rangle$. The
basic laws of quantum physics dictate that a description of the entire possible
state-space of the qubit is given by the wavefunction

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{1}$$

where $\alpha$ and $\beta$ are any two complex numbers such that
$|\alpha|^2 + |\beta|^2 = 1$. This is called a "qubit" since it can assume one
of two binary values, but of course it has fundamentally different properties
because of the possibility of it being in a superposition of these two values.
The properties with which quantum mechanics endows the qubit make possible a
kind of cryptography which is fundamentally secure against eavesdropping
attacks, and computations which apparently violate the complexity-class
categorizations for ordinary boolean computers.

One of the ideas of this sort that has been understood recently is the
possibility of data compression for qubits. In classical information theory, if
$n$ bits $x_1 \ldots x_n$ are each sampled independently according to some
probability distribution $p = (p_0, p_1)$ (on the set $\{0, 1\}$), then the
string $x_1 \ldots x_n$ may be compressed to an $nH_S(p)$-bit string (where
$H_S(p) = -\sum_{i=0}^{1} p_i \log p_i$ is the Shannon entropy), and no further,
in the following asymptotic sense. For any $\epsilon, \delta > 0$, for
sufficiently large $n$, for any $\lambda(n) > n(H_S(p) + \delta)$,
$\lambda(n) \in \{1, \ldots, n\}$, there exists a compression scheme that
compresses $x_1 \ldots x_n$ to $y_1 \ldots y_{\lambda(n)}$, such that
$x_1 \ldots x_n$ can be successfully recovered from $y_1 \ldots y_{\lambda(n)}$
with probability greater than $1 - \epsilon$. Moreover, the above compression is
the maximum possible in the sense that, for any $\epsilon, \delta > 0$, for
sufficiently large $n$, for any $\lambda(n) < n(H_S(p) - \delta)$, for any
compression scheme that maps $x_1 \ldots x_n$ to $y_1 \ldots y_{\lambda(n)}$,
the probability that $x_1 \ldots x_n$ can be successfully recovered from
$y_1 \ldots y_{\lambda(n)}$ is less than $\epsilon$.

The quantum physical analogue of the above scenario involves the compression of
a string of qubits, instead of bits. Note that there is a continuum of possible
states for each qubit, rather than two possible values. We shall consider the
"discrete" case, where a probability distribution is concentrated on some
finite set of qubit states $S = \{|\psi_1\rangle, \ldots, |\psi_m\rangle\}$. Let
the respective probabilities be $p = (p_1, \ldots, p_m)$. In the language of
quantum physics, $(S,p)$ defines an ensemble of states. Let
$|\alpha_1\rangle \ldots |\alpha_n\rangle$ be a string of $n$ qubits, each
sampled independently from $(S,p)$. Define a compressor $A$ as a unitary
transformation that maps $n$-qubit strings to $n$-qubit strings. Again let
$\lambda(n) \in \{1, \ldots, n\}$. It is to be understood that, on input
$|\alpha_1\rangle \ldots |\alpha_n\rangle$, the first $\lambda(n)$ qubits that
are output by the compressor, $|\beta_1 \ldots \beta_{\lambda(n)}\rangle$, are
taken as the compressed version of its input, and the remaining
$n - \lambda(n)$ qubits are discarded. A decompressor $B$ is a unitary
transformation that maps $n$-qubit strings to $n$-qubit strings. It is to be
understood that the first $\lambda(n)$ qubits input to the decompressor are
$|\beta_1 \ldots \beta_{\lambda(n)}\rangle$, the compressed version of some
sequence of $n$ qubits, and the remaining $n - \lambda(n)$ qubits are all
$|0\rangle$. An $n$-to-$\lambda(n)$ quantum compression scheme is a
compressor/decompressor pair $(A,B)$. As in the classical case, the goal is to
achieve as high a compression rate (i.e., as small a $\lambda(n)$) as possible,
while permitting the original message to be recovered from its compressed
version with high probability.

Assume that the compressor knows (i.e. can be a function of) the underlying ensemble 
(S,p), but has no explicit knowledge about the specific random selections made (inter- 
estingly, compressors exist that know even less than (S,p); more about this later). In the 
classical case, the compressor obtains complete information about the bits to be compressed, 
but complete information cannot generally be obtained from a qubit. If the possible qubit 
states in S are not mutually orthogonal then any observation of such a qubit will only yield 
partial information about its state, and can irretrievably change this state. Due to this,
one might expect to be able to achieve less in the quantum scenario than with classical
compression schemes; in fact, the opposite is true.

Let us measure the quality of an $n$-to-$\lambda(n)$ compression scheme $(A,B)$
with respect to a source distribution $p$ in terms of its fidelity, defined as
follows. Consider the following experiment. Let the sequence
$|\alpha_1\rangle \ldots |\alpha_n\rangle$ be sampled independently from
$(S,p)$. Transform $|\alpha_1\rangle \ldots |\alpha_n\rangle$ according to the
compressor $A$ and let $|\beta_1 \ldots \beta_{\lambda(n)}\rangle$ be the
compressed version. Next, transform
$|\beta_1 \ldots \beta_{\lambda(n)}\rangle|0 \cdots 0\rangle$ according to the
decompressor $B$ and let $|\alpha'_1 \ldots \alpha'_n\rangle$ be the output.
Finally, measure $|\alpha'_1 \ldots \alpha'_n\rangle$ with respect to a basis
containing $|\alpha_1 \ldots \alpha_n\rangle$. The fidelity is the probability
$|\langle \alpha'_1 \ldots \alpha'_n | \alpha_1 \ldots \alpha_n \rangle|^2$ that
this measurement results in $|\alpha_1 \ldots \alpha_n\rangle$.

Note that the fidelity is with respect to two sources of randomness: (a) the
random choices in the original generation of
$|\alpha_1 \ldots \alpha_n\rangle$; and (b) the randomness that results from
performing a measurement of the state $|\alpha'_1 \ldots \alpha'_n\rangle$.
Roughly speaking, the fidelity can be high if for "most" choices in (a),
$|\alpha'_1 \ldots \alpha'_n\rangle$ is "close to"
$|\alpha_1 \ldots \alpha_n\rangle$.

The ensemble $(S,p)$ represents a mixed state, which has density matrix $\rho$,
defined as

$$\rho = \sum_{i=1}^{m} p_i |\psi_i\rangle\langle\psi_i|.$$

The von Neumann entropy corresponding to $(S,p)$ is defined in terms of the
density matrix $\rho$ as $H_{VN}(\rho) = -\mathrm{Tr}(\rho \log \rho)$. In
general, $H_{VN}(\rho) \le H_S(p)$, with equality occurring if and only if the
states in $S$ are mutually orthogonal.

Roughly speaking, Schumacher's theorem states that $nH_{VN}(\rho)$ is
asymptotically the maximum compression attainable for $n$ qubits resulting from
a source with density matrix $\rho$. More precisely, let $(S,p)$ be any ensemble
of qubits, and $\rho$ be the corresponding density matrix. Then, for all
$\epsilon, \delta > 0$, for sufficiently large $n$ and
$\lambda(n) > n(H_{VN}(\rho) + \delta)$, there exists an $n$-to-$\lambda(n)$
quantum compression scheme for $(S,p)$ with fidelity greater than
$1 - \epsilon$. Moreover, for all $\epsilon, \delta > 0$, for sufficiently large
$n$, if $\lambda(n) < n(H_{VN}(\rho) - \delta)$ then every $n$-to-$\lambda(n)$
quantum compression scheme has fidelity less than $\epsilon$.

It should be noted that the above bounds are robust in the sense that they do
not change when a number of technical variations are made in the scenario. For
example, the $n$-to-$\lambda(n)$ compression schemes that attain fidelity
greater than $1 - \epsilon$ can be restricted to being highly "oblivious", in
that they depend only on knowing a basis for which the density matrix is
diagonal, with nonincreasing values along the diagonal. Also, even if the
compressor is supplied with complete information about the state of the source
string $|\alpha_1 \ldots \alpha_n\rangle$ that it receives, $\epsilon$ still
bounds the fidelity attainable if $\lambda(n) < n(H_{VN}(\rho) - \delta)$.

The proof of Schumacher's theorem is based on the existence of a "typical
subspace" $\Lambda$ of the Hilbert space of $n$ qubits, which has the property
that, with high probability, a sample of $|\alpha_1, \ldots, \alpha_n\rangle$
has almost unit projection onto $\Lambda$. It has been shown that the dimension
of $\Lambda$ is $2^{nH_{VN}(\rho)}$; thus, the operation that the compressor
should perform involves "transposing" the subspace $\Lambda$ into the Hilbert
space of a smaller block of $nH_{VN}(\rho)$ qubits.

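Bennett gives a more explicit procedure for accomplishing this "transposition",
which we illustrate with an example. Suppose that
$S = \{|\psi_1\rangle, |\psi_2\rangle\}$, where $|\psi_1\rangle = |0\rangle$ and
$|\psi_2\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$,
and $p = (p_1, p_2)$, where $p_1 = p_2 = \frac{1}{2}$. The density matrix
corresponding to $(S,p)$ is
$\rho = \frac{1}{2}|0\rangle\langle 0| + \frac{1}{2}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle\right)\left(\frac{1}{\sqrt{2}}\langle 0| + \frac{1}{\sqrt{2}}\langle 1|\right)$,
or, in $2 \times 2$ matrix form,

$$\rho = \begin{pmatrix} \frac{3}{4} & \frac{1}{4} \\ \frac{1}{4} & \frac{1}{4} \end{pmatrix} \quad \text{in the basis} \quad |0\rangle,\ |1\rangle. \tag{2}$$

It is always possible to go to a basis in which the density matrix is diagonal:

$$\rho = \begin{pmatrix} \frac{3}{4} + \frac{1}{4}\tan\frac{\pi}{8} & 0 \\ 0 & \frac{1}{4} - \frac{1}{4}\tan\frac{\pi}{8} \end{pmatrix} \quad \text{in the basis} \quad \begin{array}{l} |0'\rangle = \cos\frac{\pi}{8}|0\rangle + \sin\frac{\pi}{8}|1\rangle \\ |1'\rangle = -\sin\frac{\pi}{8}|0\rangle + \cos\frac{\pi}{8}|1\rangle \end{array} \tag{3}$$

Both of the states $|\psi_1\rangle$ and $|\psi_2\rangle$ have large overlap on
the basis state $|0'\rangle$ ($|\langle\psi_i|0'\rangle| = \cos\frac{\pi}{8}$),
and small overlap on the orthogonal basis state $|1'\rangle$
($|\langle\psi_i|1'\rangle| = \sin\frac{\pi}{8}$). This observation leads to a
way of compressing strings of signal states. Consider all $n$-qubit strings
possible from the states in $S$. These strings can all be expressed with respect
to the basis consisting of
$|x_{n-1} \ldots x_0\rangle = |x_{n-1}\rangle \ldots |x_0\rangle$, where
$|x_{n-1}\rangle, \ldots, |x_0\rangle \in \{|0'\rangle, |1'\rangle\}$. Each such
$|x_{n-1} \ldots x_0\rangle$ can be interpreted as an $n$-bit binary number,
and, thus, can be denoted as $|x\rangle$, for $x \in \{0, \ldots, 2^n - 1\}$.
Now, the overlap of $|x\rangle$ with the states in $S^n$ is
$|\langle x|S^n\rangle| = \cos^m\frac{\pi}{8}\,\sin^{n-m}\frac{\pi}{8}$, where
$m$ is the number of 0's in the binary representation of $x$. Because this
overlap diminishes exponentially with $n - m$, basis states with large numbers
of 1's are relatively unimportant for describing any string
$|\alpha_1, \ldots, \alpha_n\rangle$; the Hilbert space can thus be truncated to
the typical subspace $\Lambda$ consisting of all states $|x\rangle$ in which the
binary number $x$ contains a proportion of 1's less than
$H_{VN}(\rho) < 0.601$.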
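
The numbers in this example can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the two-state ensemble described above ($|0\rangle$ and the equal superposition of $|0\rangle$ and $|1\rangle$, each with probability 1/2) and base-2 logarithms; the helper names `shannon_entropy` and `eigenvalues_sym_2x2` are illustrative and do not come from the paper.

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a list of probabilities (zero entries are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

# Density matrix of the example ensemble: rho = [[3/4, 1/4], [1/4, 1/4]].
a, b, d = 0.75, 0.25, 0.25
lam_big, lam_small = eigenvalues_sym_2x2(a, b, d)

# Rotation angle of the diagonalizing basis; expected to be pi/8.
theta = 0.5 * math.atan2(2 * b, a - d)

# The von Neumann entropy is the Shannon entropy of the eigenvalues of rho.
h_vn = shannon_entropy([lam_big, lam_small])
h_s = shannon_entropy([0.5, 0.5])  # Shannon entropy of the source probabilities

if __name__ == "__main__":
    print(f"eigenvalues: {lam_big:.6f}, {lam_small:.6f}")
    print(f"theta / pi : {theta / math.pi:.6f}")
    print(f"H_VN = {h_vn:.6f}, H_S = {h_s:.6f}")
```

Under these assumptions the von Neumann entropy comes out just below 0.601 bits per qubit, while the Shannon entropy of the source probabilities is a full bit; the gap reflects the non-orthogonality of the two signal states.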

Thus the "transposition" which the coder must do consists of mapping this
$\Lambda$ subspace for $n$ qubits into the states spanned by fewer than $0.601n$
of those qubits.

We must accomplish this by a unitary transformation applied to the original
states of the $n$ qubits. In the basis $|0\rangle, \ldots, |2^n - 1\rangle$,
this transformation must map, in succession, the qubit strings with the smallest
number of 1's into the qubit binary strings with the smallest numerical value.
This is a classical combinatorial calculation, "classical" in the sense that
definite binary-number states are mapped to other definite binary-number states;
however, it is essential that the computation be performed quantum mechanically,
since the computation must preserve the superpositions of these basis states.
This means that the combinatorial computation must be performed using
reversible, quantum-coherent elementary operations.

The principal object of this paper is to derive the quantum computation which is
needed to do this Schumacher coding. In Sec. II we derive the analytical formula
for the sorting calculation required for the coding. Sec. III constructs the
quantum program for performing this calculation: Sec. III A illustrates a first
attempt at this coding exercise; Sec. III B discusses the way in which the
calculation is to be properly made reversible; and Sec. III C, which contains
the essential result of the paper, gives the final quantum program for
Schumacher coding. Sec. IV gives, in the same programming notation developed in
the earlier sections, the bit-level routines needed for performing the steps in
the high-level program. Appendix A discusses how these bit-level routines may be
made highly space-efficient, with only a modest increase in running time (these
latter routines result in a smaller time-space product, which may be desirable
[13]). Appendix B provides other ways of economizing in the bit-level
implementation of these codes, by using some of the phase freedom coming from
the quantum-mechanical nature of the computation.



II. COMBINATORIAL EXPRESSION FOR SCHUMACHER CODING 

As Bennett has described, a specific realization of the unitary transformation
performing the Schumacher coding function on a set of identical qubits consists
of a sorting computation in which the states
$|0\rangle, \ldots, |2^n - 1\rangle$ are given a lexicographical ordering
according to how many 1's are in their binary expansion. So, $|0\rangle$ is
mapped to itself, all the states containing exactly one 1 and $n - 1$ 0's are
mapped to the states between $|1\rangle$ and $|n\rangle$, all the states with
exactly two 1's and $n - 2$ 0's are mapped to the states between
$|n + 1\rangle$ and $|n + n(n-1)/2\rangle$, and, in general, all the states with
exactly $m$ 1's and $n - m$ 0's are mapped to the states between

$$\left|\sum_{i=0}^{m-1}\binom{n}{i}\right\rangle \tag{4}$$

and

$$\left|\sum_{i=0}^{m}\binom{n}{i} - 1\right\rangle \tag{5}$$

inclusive. The Schumacher function does not require any particular ordering of
the states within each of these blocks, except that the mapping must be 1-to-1
(i.e., a bijection); but it turns out to be convenient to preserve
lexicographical ordering within each block. Defining the index number within
each block as $I[x,n,m]$, the total Schumacher function for string $x$ (with
$n$ bits and $m$ 1's) is

$$f(x) = \sum_{i=0}^{m-1}\binom{n}{i} + I[x,n,m]. \tag{6}$$

The index number $I$ obeys a recursive relationship which we now derive.
Considering the possible binary-number strings representing the input state $x$,
any string whose first 1 occurs in the $(p+1)$st place (i.e., whose first $p$
bits are 0) must have a higher index number than all strings in which the first
$p + 1$ places are 0. There are exactly $\binom{n-p-1}{m}$ such strings, which
means that for the particular input string



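$$x = \underbrace{00\cdots0}_{p\ \text{0's}}\,1\,\underbrace{00\cdots0}_{n-m-p\ \text{0's}}\,\underbrace{11\cdots1}_{m-1\ \text{1's}}, \tag{7}$$

the index number $I[x,n,m] = \binom{n-p-1}{m}$. This result permits the index
number of the more complex string

$$x = \underbrace{00\cdots0}_{p\ \text{0's}}\,1\,\underbrace{x'}_{n-p-1\ \text{bits}} \tag{8}$$

to be expressed recursively:

$$I[x,n,m] = \binom{n-p-1}{m} + I[x', n-p-1, m-1]. \tag{9}$$

It is probably easiest to understand Eq. (9) by writing out an example:

$$\begin{aligned}
I[0010011011,10,5] &= \binom{10-2-1}{5} + I[0011011,7,4], \\
I[0011011,7,4] &= \binom{7-2-1}{4} + I[1011,4,3], \\
I[1011,4,3] &= \binom{4-0-1}{3} + I[011,3,2], \\
I[011,3,2] &= 0.
\end{aligned} \tag{10}$$

As this illustrates, the recursion of Eq. (9) may be iterated to produce an
expression for $I$ for a general input string $x$:

$$I[x,n,m] = \sum_{j=0}^{n-1} x_j \binom{j}{\sum_{k=0}^{j} x_k}. \tag{11}$$

Here the notation $x_p$ denotes the value of the $p$th bit of the string $x$.
Combining Eq. (11) with Eq. (6) yields the final expression for the Schumacher
coding function:

$$f(x) = \sum_{i=0}^{\left(\sum_{k=0}^{n-1} x_k\right) - 1}\binom{n}{i} + \sum_{j=0}^{n-1} x_j \binom{j}{\sum_{k=0}^{j} x_k}. \tag{12}$$

In this equation, binomial coefficients outside their natural range (e.g.,
$\binom{j}{k}$ with $k > j$) are understood to be zero.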
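
The combinatorial expression can be evaluated directly on classical bit strings. The sketch below assumes the reading of the index formula used in this section, namely that bit $j$ (counted from the least significant end) contributes a binomial coefficient whose lower entry is the number of 1's in bits $0$ through $j$; the function names `index_in_block` and `schumacher` are illustrative and are not taken from the paper.

```python
from math import comb

def index_in_block(x, n):
    """Index I[x, n, m] of the n-bit string x among strings with the same number of 1's."""
    total, ones = 0, 0
    for j in range(n):                 # j = 0 is the least significant bit
        if (x >> j) & 1:
            ones += 1                  # number of 1's in bits 0..j
            total += comb(j, ones)     # comb() returns 0 when ones > j
    return total

def schumacher(x, n):
    """Block offset (strings with fewer 1's) plus the index inside the block."""
    m = bin(x).count("1")
    return sum(comb(n, i) for i in range(m)) + index_in_block(x, n)

if __name__ == "__main__":
    # Worked example from the text: the string 0010011011 with n = 10, m = 5.
    print(index_in_block(0b0010011011, 10))
    # The map is a bijection on n-bit strings; list it for n = 3.
    print([schumacher(x, 3) for x in range(8)])
```

For the worked example the three nonzero recursion steps contribute 21 + 1 + 1, so the index is 23. Sorting the outputs for all inputs of a given length reproduces every integer exactly once, which is the bijection property the coding relies on.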
III. HIGH-LEVEL QUANTUM PROGRAM FOR SCHUMACHER CODING

A. First attempts

It is now our object to translate Eq. (12) into a sequence of elementary
quantum-mechanical manipulations. We proceed to do this by writing out the
calculation in a high-level "pseudocode" [9] which, when "compiled", would
permit the operation to be performed by a sequence of elementary spectroscopic
manipulations such as two-bit XOR's (or controlled-NOT's), along with one-bit
rotations [10]. Rather than building up the rules of this pseudocode
axiomatically, we will proceed in an intuitive fashion. The principal constraint
which the coded calculation must obey is that it be done reversibly. Instead of
going into a discourse about this, let us present the first try at coding
Eq. (12) (not a perfectly successful one, in fact):

Program FIRST_TRY
quantum registers:
    X : n-bit register
    Y : n-bit arithmetic register (initialized to 0)
    S : $\lceil \log n \rceil$-bit arithmetic register (initialized to 0)

if $X_0 = 1$ then $S \leftarrow S + 1$
for $j := 1$ to $n - 1$ do
    if $X_j = 1$ then $S \leftarrow S + 1$
    for $m = 0$ to $j + 1$ do
        if $X_j = 1$ and $S = m$ then $Y \leftarrow Y + \binom{j}{m}$
for $i = 0$ to $n - 1$ do
    if $i + 1 \le S$ then $Y \leftarrow Y + \binom{n}{i}$

FIRST_TRY is not incorrect, but it is incomplete, in ways which we will repair
by stages below. Here are some rules of this programming: All the
quantum-mechanical registers are in capital letters. In FIRST_TRY, these are X
(which is initialized with an input state $x$, or a quantum superposition of
such input states), Y (which is initialized to 0, and whose final value is the
output state $y$ or their quantum superpositions), and S (a small work register,
also initialized to 0). The notation $X_i$ indicates the $i$th bit of X. Note
that Y and S are given the data type "arithmetic", indicating that ordinary
integer addition and subtraction are allowed with them. Only bitwise
manipulations are performed on X. (In the FINAL_SCHUMACHER program, both bitwise
and arithmetic manipulations will be performed on the same registers.)

All other lower-case variables in the program always have definite values and
can (and should) be implemented using classical bits. Only the quantum registers
need to be explicitly treated reversibly. So, the binomial coefficients
$\binom{j}{m}$ can be precomputed or evaluated by any means, reversible or not,
in the implementation of the quantum computation.
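
Because FIRST_TRY maps definite binary states to definite binary states, its action on a single basis state can be traced with ordinary integers. The following sketch makes several assumptions about the listing: both inner loops start at 0, the innermost statement adds the binomial coefficient "j choose m", and the last loop adds "n choose i" whenever i + 1 is at most S. The reference ordering `block_sorted_rank` is a hypothetical helper that simply sorts all strings by (number of 1's, numerical value).

```python
from math import comb

def first_try(x, n):
    """Trace FIRST_TRY on one classical basis state x; returns the final (X, Y, S)."""
    X, Y, S = x, 0, 0
    if X & 1:
        S += 1
    for j in range(1, n):
        bit = (X >> j) & 1
        if bit:
            S += 1
        for m in range(0, j + 2):        # m = 0 .. j+1
            if bit and S == m:
                Y += comb(j, m)          # classical, precomputable constant
    for i in range(n):
        if i + 1 <= S:
            Y += comb(n, i)
    return X, Y, S

def block_sorted_rank(n):
    """Reference answer: rank of each string when sorted by (popcount, value)."""
    order = sorted(range(2 ** n), key=lambda v: (bin(v).count("1"), v))
    return {v: r for r, v in enumerate(order)}

if __name__ == "__main__":
    n = 4
    rank = block_sorted_rank(n)
    for x in range(2 ** n):
        X, Y, S = first_try(x, n)
        print(f"{x:0{n}b} -> Y={Y:2d} S={S} (reference {rank[x]})")
```

In this classical trace X is left unchanged and S ends up holding the number of 1's in x, so the input and the work register are not returned to 0.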

In a reversible program statement, the input can always be deduced from the
output. So, for example, the statement

    if $X = 1$ then $S \leftarrow S + 1$

is reversible, because the input could be deduced by the "time-reverse" of this
statement,

    if $X = 1$ then $S \leftarrow S - 1$

An irreversible program statement would be

    if $X = 1$ then $S \leftarrow 1$

since the prior value of S cannot be deduced. As it happens, this statement
would function correctly in FIRST_TRY because S actually is equal to 0 at this
first executable statement of the program. However, we will enforce a rule that
the only irreversible statements permitted that involve quantum variables will
be the "initialized" designations present in the declaration statements. In
later programs we will introduce a "finalized" designation, which will merely
serve as a reminder that certain variables will always end the program with a
particular value if the program runs correctly. This designation will be an
important one in constructing reversible code. It is also a reminder that
physically, the finalization can serve as a useful check that no error has
occurred [11]; a quantum measurement of this register at the end of the running
of the program should always find the register in the finalized value.
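
The distinction between the reversible and the irreversible statement can be made concrete on classical values. This is a small sketch, assuming S is a register of `width` bits with wrap-around (modular) arithmetic; the function names are illustrative only.

```python
def cond_increment(x_bit, s, width):
    """if X = 1 then S <- S + 1, with S a width-bit register (arithmetic mod 2**width)."""
    return (s + x_bit) % (1 << width)

def cond_decrement(x_bit, s, width):
    """Time-reverse of cond_increment: if X = 1 then S <- S - 1."""
    return (s - x_bit) % (1 << width)

def cond_assign_one(x_bit, s):
    """Irreversible statement: if X = 1 then S <- 1 (the old value of S is lost)."""
    return 1 if x_bit else s

if __name__ == "__main__":
    width = 3
    for s in range(1 << width):
        # The decrement undoes the increment for every prior value of S.
        print(s, cond_decrement(1, cond_increment(1, s, width), width))
    # Every prior value of S collapses onto the same output.
    print({cond_assign_one(1, s) for s in range(1 << width)})
```

The increment is a permutation of the register values for each value of the control bit, which is what allows it to be undone; the assignment sends all prior values to a single output and so has no inverse.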
One further comment about the program statement

    if $X_0 = 1$ then $S \leftarrow S + 1$.

If S were a one-bit variable, this statement would just be a quantum XOR or
controlled-NOT, in which the value of S is inverted conditional on the value of
$X_0$. In FIRST_TRY, S is a multibit register; in fact it must have about
$\log_2 n$ bits. Implementation of these multi-bit functions in terms of
primitive operations involving no more than three bits is straightforward, and
is presented in Sec. IV and in Refs. [12,13]. Using quantum gates, all the
three-bit primitives may be reduced to sequences of two-bit operations [10].



A few more points about FIRST_TRY are in order. Given the constraints of
reversibility, it is a relatively straightforward transcription of Eq. (12). The
first for loop (indexed by $j$) implements the second term of Eq. (12); this is
efficient because the partial sum in the binomial coefficient can be accumulated
in S one term at a time, and then the completed sum can be used as the upper
limit of the first term of (12), which is implemented in the second for loop.
The inner $m$ loop could be replaced by the single statement

    if $X_j = 1$ then $Y \leftarrow Y + \binom{j}{S}$,

but this would require a reversible calculation of the binomial function; we
have chosen to make this binomial-coefficient calculation classical by writing
out the loop as shown. One might also be tempted to modify the inner loop as
follows:

    if $X_j = 1$ then
        for $m = 0$ to $j + 1$ do
            if $S = m$ then $Y \leftarrow Y + \binom{j}{m}$

While moving the if statement out of the loop is superficially more efficient,
it turns out that, when these statements are re-expressed in terms of primitive
operations, the if must be carried down to the lowest level in any case; so, we
prefer a syntax in which such conditionals are explicitly shown at the lowest
level.

B. Reversibility considerations 

Now, what is the overall effect of FIRST_TRY, and why is it inadequate for
performing the Schumacher function? Let $f : \{0,1\}^n \rightarrow \{0,1\}^n$
denote the Schumacher function for $n$-bit binary strings. If the total input
state is expressed as the ket

$$|X,Y,S\rangle = |x,0,0\rangle, \tag{13}$$

then the complete final state is

$$|X,Y,S\rangle = |x,f(x),s\rangle \tag{14}$$

(where $s$ is the number of 1's in $x$). But the correct Schumacher function
must have a final state of the form

$$|X,Y,S\rangle = |0,f(x),0\rangle. \tag{15}$$

That is, the input $x$ should be erased and the work register S should be reset
to its initial value of 0. This is possible to accomplish reversibly because the
Schumacher function is bijective, so that no record of the initial state, or of
the state of the work bits, needs to be retained at the end; they are completely
deducible from the output. In fact, the correct operation of the Schumacher
function requires that the output be of the form (15); if it is of the form of
(14), then the final state is "entangled" with the initial state, which means
that output states cannot be placed in the desired superpositions of states.
Thus, the net result of the Schumacher function should be confined to the input
data register only; this condition is obtainable from Eq. (15) if the final
output state is swapped so that the state vector becomes

$$|X,Y,S\rangle = |f(x),0,0\rangle. \tag{16}$$

Thus the Schumacher function is applied, "in-place", to the first $n$ qubits,
while the remaining $n + \log_2 n$ bits return to their original states, and may
all be viewed simply as work space for the computation. We will see later that
the "output" register Y can actually be removed entirely by using some clever
programming. Some other workspace, not displayed explicitly in (16), appears to
be necessary to do the bit-level manipulations in the Schumacher function (see
Sec. IV); Appendix A shows that the size of this extra workspace does not have
to exceed $O(\sqrt{n})$ bits.

These considerations have arisen previously in the context of reversible
programming [14], but the rationale for constructing a function in the
"fully-reversible" manner as specified by the output state (16) is somewhat
different than in the classical context. In traditional reversible programming
the object is to avoid the small energy cost involved in irreversible erasure of
any of the working bits in the computer. If such an erasure is performed, the
result of the computation will still be correct, even though the desired goal of
expending no energy is not achieved. But in quantum computation, irreversible
erasure of the state of register X in Eq. (14) actually causes register Y to be
in the wrong quantum state, in that, if the initial X was in a superposition of
computational states, the final state of Y will be a mixed quantum state, rather
than the intended, pure superposition state. Thus, the consequences of
irreversibility are more serious than in conventional reversible computation.

A method for designing a calculation to arrive at the desired final states (15)
or (16), as already worked out in the earlier literature [14], requires two
steps: 1) zero out S and any other workspaces used by the program, and 2)
explicitly implement the inverse of the Schumacher function, Eq. (12). This can
be accomplished by a program that, on input state

$$|X,Y,S\rangle = |x,y,0\rangle, \tag{17}$$

produces the final state

$$|X,Y,S\rangle = |x \oplus f^{-1}(y),y,0\rangle. \tag{18}$$

Note that applying such a transformation to the state

$$|X,Y,S\rangle = |x,f(x),0\rangle \tag{19}$$

yields the required state

$$|X,Y,S\rangle = |0,f(x),0\rangle. \tag{20}$$

Eq. (18) is not implemented simply by running FIRST_TRY in reverse; indeed, the
inverse function can have very different and much greater complexity than the
function itself [15]. Fortunately, in this case, as we will see in a moment, the
inverse Schumacher function is also relatively easy to implement.

Step (1) above, zeroing out S, is readily performed by adding code to the end of
FIRST_TRY to simply subtract away the bits which have been added to S:

    for $j := 0$ to $n - 1$ do
        if $X_j = 1$ then $S \leftarrow S - 1$

This code, added to the end of FIRST_TRY, produces the output state (19).



Step (2) above, implementing the inverse function Eq. (18), requires a new
algorithm. We have not found any way to write the inverse Schumacher coding
function as a formula as in Eq. (12). Nevertheless, a straightforward algorithm
can be deduced from the following two inequalities. The first is obtained by
combining the information from Eqs. (4) and (5):

$$\sum_{i=0}^{m-1}\binom{n}{i} \le f(x) < \sum_{i=0}^{m}\binom{n}{i}, \tag{21}$$

where

$$m = \sum_{k=0}^{n-1} x_k \tag{22}$$

is the number of 1's in the binary string $x$. We will be able to write simple
pseudocode to compute $m$ (a.k.a. S). This result can then be used to compute
$I[x,n,m]$ using Eq. (6). $I[x,n,m]$ satisfies an inequality which is a simple
consequence of Eq. (9) and the discussion preceding it:

$$\binom{n-p-1}{m} \le I[x,n,m] < \binom{n-p}{m}. \tag{23}$$

By finding the $p$ which satisfies this inequality, we determine that the
leading $p$ bits of $x$ are zeros, and the next bit is a 1 (i.e., $x_j = 0$ for
$n - 1 - p < j \le n - 1$, and $x_{n-p-1} = 1$). The index of the remaining
substring can be determined from Eq. (9), and thus all the bits of $x$ may be
calculated recursively.
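
The recursive inversion just described can be sketched classically, without regard to reversibility. The code below is an illustration under the assumptions that the block containing the output value is found by accumulating the counts "n choose 0", "n choose 1", and so on, and that each leading 1 is located by searching for the number of leading zeros p bracketed by the two binomial coefficients in the second inequality; the name `unrank` is hypothetical.

```python
from math import comb

def unrank(y, n):
    """Recover the n-bit string x whose Schumacher code is y (0 <= y < 2**n)."""
    # Step 1: find m, the number of 1's, by locating the block that contains y.
    m, base = 0, 0
    while y >= base + comb(n, m):
        base += comb(n, m)
        m += 1
    index = y - base                     # index of x inside its block

    # Step 2: peel off one leading 1 at a time.
    x, length, ones = 0, n, m
    while ones > 0:
        p = 0                            # number of leading zeros of the substring
        while comb(length - p - 1, ones) > index:
            p += 1
        pos = length - p - 1             # position of the leading 1
        x |= 1 << pos
        index -= comb(pos, ones)         # index of the remaining substring
        length, ones = pos, ones - 1
    return x

if __name__ == "__main__":
    print([format(unrank(y, 3), "03b") for y in range(8)])
```

Unlike the quantum programs in this section, this sketch uses data-dependent loop exits, which is acceptable only because it acts on classical values.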
C. Deriving the final program

Now we will transform our procedure into reversible code. As the last section
makes clear, a necessary step for doing this will be to code the inverse of the
Schumacher function. In the spirit of FIRST_TRY, we will not worry at first
about the final state of the work registers as prescribed in Eq. (18); we will
initially just try to code correctly the inverse function itself. We will find
that reversibility will, in this case, fall out naturally from a simple
modification of our first-cut program.

Program TRY_INVERSE
quantum registers:
    X : n-bit register (initialized to 0)
    Y : n-bit signed arithmetic register (finalized to 0)
    S : $\lceil \log n \rceil$-bit register (initialized and finalized to 0)

for $m = 0$ to $n$ do
    $Y \leftarrow Y - \binom{n}{m}$
    if $Y \ge 0$ then $S \leftarrow S + 1$
for $m = 0$ to $n$ do
    if $S \le m$ then $Y \leftarrow Y + \binom{n}{m}$
for $p = 0$ to $n - 1$ do
    for $i = 0$ to $n - p$ do
        if $S = i$ and $Y \ge \binom{n-p-1}{i}$ then $X_{n-p-1} \leftarrow X_{n-p-1} \oplus 1$
        if $S = i$ and $X_{n-p-1} = 1$ then $Y \leftarrow Y - \binom{n-p-1}{i}$
    if $X_{n-p-1} = 1$ then $S \leftarrow S - 1$

In this code, the first $m$-loop does the job of finding the $m$ for which
Eq. (21) is satisfied, and putting the result in the quantum register S. As a
byproduct of this work, it subtracts away the first term of Eq. (12) from $y$,
leaving in Y the value of the index $I[x,n,m]$. Actually, the $m$-loop continues
to subtract binomial coefficients from Y after it is supposed to; this is why Y
is indicated to be a "signed" register, which can be handled by doing ordinary
arithmetic in a register with one extra bit (see [12]). This approach has the
benefit that testing that Y is non-negative only requires the examination of one
bit (see the first part of Sec. IV). We might be tempted to avoid negative
numbers by terminating the loop at the right moment, viz:

    for $m = 0$ to $n$ do
        if $Y < \binom{n}{m}$ then exit for-loop

But such an exit for-loop statement is not reversible. There appears to be no
alternative to letting the first loop go to its maximum possible upper limit,
which is $n$, and then repairing the damage done by adding back the correct
binomial coefficients in the second loop. Finally, at the end of the second
loop, Y has the desired value of $I[x,n,m]$, and S has the value of $m$.

Then the third ($p$) loop of TRY_INVERSE does the iterative decomposition of the
index $I[x,n,m]$. For every possible value of the leading number of zeros $p$
(recall Eq. (7)), TRY_INVERSE checks to see if the inequality Eq. (23) is
satisfied; if it is, then the program negates one bit of the X register. Then
the second if statement decrements Y by the combinatorial coefficient in
Eq. (9), so that it always contains the index of the next substring. The process
continues until the index is reduced to zero. Also, S is decremented so that it
always contains the current value of the number of 1's in the substring $x'$ of
Eq. (8). Note that, as in FIRST_TRY, an inner loop (indexed by $i$) is
introduced to avoid the need for reversible calculation of binomial coefficients
like $\binom{n-p-1}{S}$.

We now evaluate what state TRY_INVERSE has left the registers Y and S in. In
fact, a very desirable thing has "accidentally" occurred! We find that, on input
state

$$|X,Y,S\rangle = |0,y,0\rangle, \tag{24}$$

TRY_INVERSE produces the final state

$$|X,Y,S\rangle = |f^{-1}(y),0,0\rangle. \tag{25}$$

Thus, with a final transposing of the X and Y registers, we obtain a program
that implements the inverse of Eq. (16), so the calculation has been
successfully done in-place, with the registers Y and S remaining in their
initial state, having served only as "catalysts" for the calculation.
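
The claimed final state can be checked on classical basis states by tracing the three loops with ordinary integers. The sketch below assumes a particular reading of the TRY_INVERSE listing: the first loop subtracts "n choose m" from Y before testing its sign, the second loop adds the coefficient back whenever S is at most m, and the comparisons in the third loop are non-strict. These details are reconstructed from the surrounding discussion and should be treated as assumptions.

```python
from math import comb

def try_inverse(y, n):
    """Trace TRY_INVERSE on the basis state |0, y, 0>; returns the final (X, Y, S)."""
    X, Y, S = 0, y, 0
    for m in range(n + 1):
        Y -= comb(n, m)                  # Y may go negative: it is a signed register
        if Y >= 0:
            S += 1
    for m in range(n + 1):
        if S <= m:
            Y += comb(n, m)              # repair the over-subtraction
    for p in range(n):
        j = n - p - 1                    # bit position being decided
        for i in range(n - p + 1):       # i = 0 .. n-p; only i == S has any effect
            if S == i and Y >= comb(j, i):
                X ^= 1 << j
            if S == i and (X >> j) & 1:
                Y -= comb(j, i)
        if (X >> j) & 1:
            S -= 1
    return X, Y, S

if __name__ == "__main__":
    n = 3
    for y in range(2 ** n):
        X, Y, S = try_inverse(y, n)
        print(f"y={y} -> X={X:0{n}b} Y={Y} S={S}")
```

With this reading, every input y leaves Y and S at zero and X holding the decoded string, in agreement with the final state stated above.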
In fact, we can do even better; by a small modification of TRY_INVERSE, the Y
register can be eliminated entirely. This can be done by noting that, during the
course of an execution of TRY_INVERSE, the decrementing of Y sets each of its
high-order bits to zero in succession, and, at the same time, the values of X
are built up starting with the high-order bits and working down. Thus, the
high-order bits of Y can be re-used to hold the results of the final
calculation. It can be shown that these high-order bits are always cleared out
soon enough that they can be used for the final answer; this is done by showing
that in TRY_INVERSE, the same bits of X and Y are never simultaneously 1. Thus,
with one small modification, TRY_INVERSE can be turned into our final program
for the inverse of the Schumacher coding function:

Program FINAL_SCHUMACHER_INVERSE
quantum registers:
    X : n-bit signed arithmetic register
    S : $\lceil \log n \rceil$-bit arithmetic register (initialized and finalized to 0)

for $m = 0$ to $n$ do
    $X \leftarrow X - \binom{n}{m}$
    if $X \ge 0$ then $S \leftarrow S + 1$
for $m = 0$ to $n$ do
    if $S \le m$ then $X \leftarrow X + \binom{n}{m}$
for $p = 0$ to $n - 1$ do
    for $i = 0$ to $n - p$ do
        if $S = i$ and $\mathrm{TRUNC}_{n-p-1}(X) \ge \binom{n-p-1}{i}$ then $X_{n-p-1} \leftarrow X_{n-p-1} \oplus 1$
        if $S = i$ and $X_{n-p-1} = 1$ then $X \leftarrow X - \binom{n-p-1}{i}$
    if $X_{n-p-1} = 1$ then $S \leftarrow S - 1$

The only substantial item which has been added here is the function
$\mathrm{TRUNC}_j$. Invocation of $\mathrm{TRUNC}_j(X)$ simply says that only
the $j$ least significant bits of the quantum register X (i.e., bit 0 to bit
$j - 1$) should be taken account of in the "$\ge$" comparison. This is necessary
because the high-order bits are being used to store the final answer. In the
final pass through the $p$ loop, the occurrence of the zero index in
$\mathrm{TRUNC}_0(X)$ indicates that the comparison should not be performed at
all.

For completeness, we now record the final code for the Schumacher coding
function itself. Since FINAL_SCHUMACHER_INVERSE is done "in-place", the direct
function is literally just the time-reverse:

Program FINAL_SCHUMACHER
quantum registers:

X : n-bit signed arithmetic register

S : $\lceil \log n \rceil$-bit arithmetic register (initialized and finalized to 0)

for $p = n - 1$ down to $0$ do
    if $X_{n-p-1} = 1$ then $S \leftarrow S + 1$
    for $i = n - p$ down to $0$ do
        if $S = i$ and $X_{n-p-1} = 1$ then $X \leftarrow X + \binom{n-p-1}{i}$
        if $S = i$ and $\mathrm{TRUNC}_{n-p-1}(X) > \binom{n-p-1}{i}$ then $X_{n-p-1} \leftarrow X_{n-p-1} \oplus 1$
for $m = n$ down to $0$ do
    if $S > m$ then $X \leftarrow X - \binom{n}{m}$
for $m = n$ down to $0$ do
    if $X > 0$ then $S \leftarrow S - 1$
    $X \leftarrow X + \binom{n}{m}$

IV. BIT-LEVEL QUANTUM PROGRAM FOR SCHUMACHER CODING 

In this section, we explain how the statements in programs FINAL_SCHUMACHER
and FINAL_SCHUMACHER_INVERSE can be implemented by a gate-array with fundamental
bit-level operations. These fundamental operations are essentially Toffoli gates
[16]. The Toffoli gate that negates bit B iff bits C and D are both 1 (and doesn't change
the values of C and D) is denoted as
$B \leftarrow B \oplus (C \wedge D)$.



In [10] it is shown that such an operation can be simulated in terms of eight one-bit operations
and eight XOR operations (which are of the form $B \leftarrow B \oplus C$). For convenience, we
expand our repertoire of allowable basic operations to include

$B \leftarrow B \oplus 1$
$B \leftarrow B \oplus C$
$B \leftarrow B \oplus \overline{C}$
$B \leftarrow B \oplus (C \wedge D)$
$B \leftarrow B \oplus (\overline{C} \wedge D)$
$B \leftarrow B \oplus (C \vee D)$.

As with Toffoli gates, each of these gates can be simulated by at most eight one-bit operations
and eight XOR operations. In many cases a quantum phase freedom can be used to simulate
these in fewer one- and two-bit gates (see Appendix B).

The first step to converting the programs into gate-arrays is to "unravel" the for loops.
Since the ranges of these loops are all fixed prior to any computation, this is straightforward.
Next, we note that (once the for loops have been unravelled) there are essentially five types
of program statements:

1. $X \leftarrow X + k$

2. if $B$ then $X \leftarrow X + k$

3. if $Y > l$ then $X \leftarrow X + k$

4. if $Y = l$ and $B$ then $X \leftarrow X + k$

5. if $Y = l$ and $Z > k$ then $B \leftarrow B \oplus 1$

(where B is a bit, X, Y, Z are signed arithmetic registers, and k, l are signed integers).

Also, there are a priori upper bounds on the ranges of the arithmetic registers (and thus
on the number of bits required to specify them). An arithmetic register whose range of values
is known to be an integer within $[0, 2^n)$ can be naturally represented by n bits and arithmetic
operations on it can be simulated by reversibly performing them modulo $2^n$. Also, a signed
arithmetic register whose range of values is known to lie within $[-2^n, +2^n)$ can be naturally
represented in "two's complement" form by n + 1 bits, and it is well known that arithmetic
operations on such a two's complement integer can be simulated by interpreting it as an
integer in the range $[0, 2^{n+1})$ and performing arithmetic modulo $2^{n+1}$ (see, for example, [17]).
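The two's complement convention described above can be checked with a few lines of Python. This is a minimal sketch; the helper names `to_twos_complement`, `from_twos_complement`, and `add_constant` are hypothetical and are not part of the programs in the text.

```python
def to_twos_complement(value, n):
    """Encode an integer in [-2**n, 2**n) as an unsigned (n+1)-bit pattern."""
    if not -(1 << n) <= value < (1 << n):
        raise ValueError("value does not fit in n + 1 bits")
    return value % (1 << (n + 1))

def from_twos_complement(pattern, n):
    """Decode an (n+1)-bit pattern back to a signed integer."""
    return pattern - (1 << (n + 1)) if pattern >> n else pattern

def add_constant(pattern, k, n):
    """Add the classical constant k, working modulo 2**(n+1)."""
    return (pattern + k) % (1 << (n + 1))
```

Adding a fixed k modulo $2^{n+1}$ is a permutation of the $(n+1)$-bit patterns, which is what makes the operation reversible; whenever the true sum stays inside $[-2^n, 2^n)$, decoding the result gives the ordinary signed sum.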

A. Addition and Conditional Addition 

In view of the above discussion, to simulate
if $B$ then $X \leftarrow X + k$,

it suffices to perform

$X \leftarrow (X + B \cdot k) \bmod 2^n$
(in other words to add k to X modulo $2^n$ iff B = 1). In the case where X is an n-bit signed
register, it suffices to substitute n + 1 for n above.

The program below performs this using n auxiliary bits $C_0, C_1, \ldots, C_{n-1}$ (which are
assumed to have initial value 0, and are reset to 0 by the end of the computation).

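Program CONDITIONAL_ADD_k
quantum registers:

X : n-bit signed arithmetic register
B : bit register

$C_0, C_1, \ldots, C_{n-1}$ : bit registers (initialized and finalized to 0)

for $i = 1$ to $n - 1$ do
    $C_i \Leftarrow C_i \oplus \mathrm{MAJ}(k_{i-1}, X_{i-1}, C_{i-1})$
for $i = n - 1$ down to $1$ do
    $X_i \leftarrow X_i \oplus (k_i \wedge B)$
    $X_i \leftarrow X_i \oplus (C_i \wedge B)$
    $C_i \Leftarrow_R C_i \oplus \mathrm{MAJ}(k_{i-1}, X_{i-1}, C_{i-1})$
$X_0 \leftarrow X_0 \oplus (k_0 \wedge B)$

where

$\mathrm{MAJ}(I, S, T) = \begin{cases} S \wedge T & \text{if } I = 0 \\ S \vee T & \text{if } I = 1. \end{cases}$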

The number of basic operations performed by the above program is bounded above by
$4n + O(1)$. In particular, if the for loops of this program are unravelled then the program
corresponds to a gate-array consisting of $2n + 1$ bits and $4n + O(1)$ gates. (A more
space-efficient $(n + O(\sqrt{n}))$-bit program is described in Appendix A.)

The unconditional addition statement
$X \leftarrow X + k$

can be easily simulated by replacing $(k_i \wedge B)$ and $(C_i \wedge B)$ in the above program with
$k_i$ and $C_i$ (respectively).
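The carry-compute, add, and carry-uncompute structure of CONDITIONAL_ADD can be simulated on classical bits. The Python sketch below follows the statement order of the program; the function names `maj` and `conditional_add` are hypothetical, and it assumes that the final update of the lowest bit is conditioned on B in the same way as the other bits.

```python
def maj(i, s, t):
    """MAJ(I, S, T): S AND T if I == 0, S OR T if I == 1."""
    return (s | t) if i else (s & t)

def conditional_add(x, k, b, n):
    """Classically simulate CONDITIONAL_ADD_k on an n-bit value x.

    Returns (x + b*k) mod 2**n and checks that the carry bits end at 0.
    """
    X = [(x >> i) & 1 for i in range(n)]
    K = [(k >> i) & 1 for i in range(n)]
    C = [0] * n                       # auxiliary carry bits, C[0] stays 0
    for i in range(1, n):             # compute the carry into each bit i
        C[i] ^= maj(K[i - 1], X[i - 1], C[i - 1])
    for i in range(n - 1, 0, -1):     # add, then uncompute, high bits first
        X[i] ^= K[i] & b
        X[i] ^= C[i] & b
        C[i] ^= maj(K[i - 1], X[i - 1], C[i - 1])
    X[0] ^= K[0] & b                  # assumed: lowest bit also conditioned on b
    assert not any(C), "carry registers must be restored to 0"
    return sum(bit << i for i, bit in enumerate(X))
```

Each carry is uncomputed while the lower bit of X it depends on is still unmodified, which is why the second loop runs from the high-order end downward.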

CONDITIONAL_ADD introduces two modified assignment symbols "$\Leftarrow$" and
"$\Leftarrow_R$". For the present purposes these can be thought of as identical to the ordinary
"$\leftarrow$" assignment; however, they signal a freedom in how the quantum phase may be handled
in these assignments, as discussed in Appendix B.

One final note about CONDITIONAL_ADD: it involves only the addition of a quantum
register with an ordinary, classical number. It is possible to write a similar program
which adds two quantum registers, as has been illustrated in [12]; however, this more complex
routine is never needed for the implementation of the Schumacher function. Actually,
it is generally possible to implement a full quantum adder as a sequence of calls to
CONDITIONAL_ADD.



B. Equality and Inequality Testing 

In order to simulate the remaining types of statements, it suffices to simulate equality
test statements of the form

$B \leftarrow B \oplus (X = k)$

(which negate B iff X = k), and inequality test statements of the form

$B \leftarrow B \oplus (X > k)$

(which negate B iff X > k).

With implementations of the above tests, the statement

if $Y > l$ then $X \leftarrow X + k$

is then easily simulated by the sequence

$B \Leftarrow B \oplus (Y > l)$
if $B$ then $X \leftarrow X + k$
$B \Leftarrow_R B \oplus (Y > l)$

where B is a bit register distinct from the bits of X and Y, and whose initial value is 0 (note
that B must be reset to 0 after the addition is performed). Also, the compound conditional

if $Y = l$ and $B$ then $X \leftarrow X + k$

is simulated by the sequence

$C \Leftarrow C \oplus (Y = l)$
$D \Leftarrow D \oplus (C \wedge B)$
if $D$ then $X \leftarrow X + k$
$D \Leftarrow_R D \oplus (C \wedge B)$
$C \Leftarrow_R C \oplus (Y = l)$

where C and D are bit registers distinct from the bits of X, Y, and B, and whose initial (and
final) values are 0. Again, the meaning and usefulness of the phase-modified assignments is
discussed in Appendix B.

The following program simulates an equality test. It uses n auxiliary bit registers
$C_0, C_1, \ldots, C_{n-1}$. The auxiliary registers are initialized to 0, and have final value 0.

Program TEST_EQUALITY_TO_k
quantum registers:

X : n-bit signed arithmetic register
B : bit register

$C_0, C_1, \ldots, C_{n-1}$ : bit registers (initialized and finalized to 0)

$C_{n-1} \Leftarrow C_{n-1} \oplus (X_{n-1} = k_{n-1})$
for $i = n - 2$ down to $0$ do
    $C_i \Leftarrow C_i \oplus (C_{i+1} \wedge (X_i = k_i))$
$B \leftarrow B \oplus C_0$
for $i = 0$ to $n - 2$ do
    $C_i \Leftarrow_R C_i \oplus (C_{i+1} \wedge (X_i = k_i))$
$C_{n-1} \Leftarrow_R C_{n-1} \oplus (X_{n-1} = k_{n-1})$

where

$(S = I) = \begin{cases} S & \text{if } I = 1 \\ \overline{S} & \text{if } I = 0. \end{cases}$

The number of basic operations performed by the above program is bounded above by
$2n + O(1)$. (The above program is very similar to the so-called $\wedge_n$-gate construction in [10].)

Finally, the following program simulates an inequality test. It uses n auxiliary bit registers
$C_0, C_1, \ldots, C_{n-1}$. The auxiliary registers are initialized to 0, and have final value 0.

Program TEST_GREATER_THAN_k
quantum registers:

X : n-bit signed arithmetic register
B : bit register

$C_0, C_1, \ldots, C_{n-1}$ : bit registers (initialized and finalized to 0)

$C_{n-1} \Leftarrow C_{n-1} \oplus (X_{n-1} = k_{n-1})$
$B \leftarrow B \oplus (X_{n-1} < k_{n-1})$
for $i = n - 2$ down to $0$ do
    $C_i \Leftarrow C_i \oplus (C_{i+1} \wedge (X_i = k_i))$
    $B \leftarrow B \oplus (C_{i+1} \wedge (X_i > k_i))$
for $i = 0$ to $n - 2$ do
    $C_i \Leftarrow_R C_i \oplus (C_{i+1} \wedge (X_i = k_i))$
$C_{n-1} \Leftarrow_R C_{n-1} \oplus (X_{n-1} = k_{n-1})$

where $(S = I)$ is as defined above,

$(S > I) = \begin{cases} S & \text{if } I = 0 \\ 0 & \text{if } I = 1, \end{cases}$

and

$(S < I) = \begin{cases} 0 & \text{if } I = 0 \\ \overline{S} & \text{if } I = 1. \end{cases}$

The number of basic operations performed by the above program is bounded above by
$3n + O(1)$. Once again, we employ phase-modified assignments, which are explained in Appendix B.
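The two tests of this subsection share one pattern: compute a chain of "all higher bits agree" flags, use them, then uncompute them in the reverse order. The Python sketch below simulates that pattern on classical bits. The names `bits`, `simulate_equality`, and `simulate_greater_than` are hypothetical, and the sketch assumes that the last auxiliary bit $C_{n-1}$ is explicitly uncomputed at the end so that all auxiliary bits return to 0.

```python
def bits(value, n):
    """Two's complement bits of value, least significant first."""
    return [(value >> i) & 1 for i in range(n)]

def simulate_equality(x, k, b, n):
    """Return b XOR (x == k) using compute / copy / uncompute."""
    X, K, C = bits(x, n), bits(k, n), [0] * n
    C[n - 1] ^= X[n - 1] == K[n - 1]
    for i in range(n - 2, -1, -1):     # C[i] = 1 iff bits n-1..i all agree
        C[i] ^= C[i + 1] & (X[i] == K[i])
    b ^= C[0]
    for i in range(n - 1):             # uncompute in the reverse order
        C[i] ^= C[i + 1] & (X[i] == K[i])
    C[n - 1] ^= X[n - 1] == K[n - 1]
    assert not any(C)
    return int(b)

def simulate_greater_than(x, k, b, n):
    """Return b XOR (x > k) for n-bit two's complement x and k."""
    X, K, C = bits(x, n), bits(k, n), [0] * n
    C[n - 1] ^= X[n - 1] == K[n - 1]
    b ^= X[n - 1] < K[n - 1]           # sign bit: 0 (non-negative) beats 1
    for i in range(n - 2, -1, -1):
        C[i] ^= C[i + 1] & (X[i] == K[i])
        b ^= C[i + 1] & (X[i] > K[i])  # first differing bit decides
    for i in range(n - 1):
        C[i] ^= C[i + 1] & (X[i] == K[i])
    C[n - 1] ^= X[n - 1] == K[n - 1]
    assert not any(C)
    return int(b)
```

B is flipped at most once, at the highest bit position where X and k differ, because $C_{i+1}$ is 1 only while all higher bits agree.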

V. DISCUSSION AND CONCLUSIONS 

We can finally put all the above results together to evaluate the total cost, in time and
space, to perform Schumacher coding. It is easy to see that the two if statements inside the
i loop of FINAL_SCHUMACHER are the most expensive part of the procedure. The
first if statement requires one call to CONDITIONAL_ADD. Although X is an n-bit
register, the addition only affects the $n - p - 1$ low-order bits of X. Thus, the addition can
be performed on $\mathrm{TRUNC}_{n-p-1}(X)$ rather than X, which amounts to a total running time
of



n - 1 n - p

^^4(n-p-l) + 0(l) = §n 3 + 0(n 2 ). (26) 

The expensive part of the second if statement is its two calls to TEST_GREATER_THAN,
performed on an $(n - p - 1)$-bit quantum register (because of the action of
TRUNC). The time involved for this is

n— 1 n—p 

Y. 2 - 3 ( n ~P- !) =n 3 + 0(n 2 ). (27) 

p=0 i=0 

Thus, the total time required (i.e., number of bit-level primitive steps) is |n 3 + 0(n 2 ). The 
total number of qubits used is: n, to hold the input/output string X; plus $\lceil \log n \rceil$, to hold
S; plus $n + O(1)$ to implement the conditional additions and inequality tests (the same
work registers that store carries and so forth may be reused throughout the execution of the
program). Thus, the total number of qubits is $2n + \lceil \log n \rceil + O(1)$.

If the space-efficient routines CONDITIONAL_ADD' and TEST_GREATER_THAN'
introduced in Appendix A are used instead, the execution time is increased to
|n 3 + 0(n 2,5 ), but the total number of qubits is reduced to n + + O(logn). If the 
relevant figure of merit for the tractability of the quantum computation is the product of 
time and space, as it is in certain physical models [0|§, then the space-efficient procedures 
we have introduced would be preferred. 

A final note about these operation counts: they are all in terms of the primitive operations
listed at the beginning of Section IV, which includes both two- and three-bit primitives. It is
known [10], [18] that all three-bit operations can be simulated in quantum logic by a sequence of
two-bit primitives. Most of the three-bit operations can be simulated using seven operations
(3 quantum XORs and 4 one-bit gates); see Appendix B. So, in terms of these primitive
operations the total time to do the Schumacher function would be roughly 7 • |n 3 < 19n 3 . 
Computing the exact prefactor would require a considerable amount of detailed calculation,
and would have to take into account the fact that many one-bit gates in the network could
be merged together and executed in one step (see [10]). All of this work could easily be done
if an actual physical implementation of Schumacher compression were ever undertaken.

To conclude, we believe that the pseudocode in which our results are presented is the most
concise and economical form in which to present a quantum computation like Schumacher
coding. The bit-level primitives for addition and comparison which we have presented are
similar to ones which have been presented elsewhere [12], but have a few features which
may make them superior in the development of other quantum programs. The Schumacher
coding can be done in $O(n^3)$ steps, with $O(\sqrt{n})$ auxiliary workspace. We cannot exclude the
possibility that a lower polynomial-order algorithm may be found, but we are not presently
aware of what form this would take. The techniques in [14] enable further shrinkage of
the auxiliary workspace, but with a larger penalty in the running time. We think that
further useful shrinkage of the auxiliary workspace is unlikely; in the present scheme, only
a vanishingly small fraction of quantum bits are used as workspace for large blocksize n.

APPENDIX A: IMPROVEMENTS IN THE WORKSPACE EFFICIENCY OF
THE BIT-LEVEL IMPLEMENTATIONS



The bit-level implementations proposed in Sections IV A and IV B require n auxiliary
bit registers. By applying techniques that were introduced in [14], we derive the following
alternate programs that employ only $O(\sqrt{n})$ auxiliary bit registers while maintaining the
same asymptotic operation complexity. (The space-reduction techniques in [14] can also be
used to reduce the auxiliary space further, but this incurs an increase in the running time,
as well as in the space-time product.)

Assume that $n = m^2$. The program CONDITIONAL_ADD'_k that follows employs
$2m - 1$ auxiliary bit registers rather than the n auxiliary bit registers that
CONDITIONAL_ADD_k employs. In CONDITIONAL_ADD_k, registers $C_0, \ldots, C_{n-1}$ are
used to store information about carry propagation. In CONDITIONAL_ADD'_k, this is
accomplished by registers $C_1, \ldots, C_{m-1}$ and $D_1, \ldots, D_{m-1}$ instead. The idea is to reset some
of the registers to 0 at various checkpoints during the course of the computation. This is
illustrated by the diagram below, where the horizontal direction represents time, and the
placement of the lines indicates the time intervals during which the registers are active,
containing the various carry bits. Registers $D_1, \ldots, D_{m-1}, C_1$ are first set to the first m carry
bits. Then $D_1, \ldots, D_{m-1}$ are reset to 0. Registers $D_1, \ldots, D_{m-1}, C_2$ can then be used to
store the $(m+1)$st to $2m$th carry bits and then $D_1, \ldots, D_{m-1}$ are reset to 0 again; since $C_1$
stores the $m$th carry bit, this can be accomplished without recomputing the first m carry
bits. The process is repeated with the remaining carry bits, and then applied in reverse to
reset the carry bits to 0, as illustrated here:



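[Diagram: active time intervals of the carry registers. The assignment of carry bits to registers that it shows is:]

carry bits                          registers used
$(m-1)m+1, \ldots, m^2-1$           $D_1, \ldots, D_{m-1}$
$(m-1)m$                            $C_{m-1}$
...                                 ...
$3m$                                $C_3$
$2m+1, \ldots, 3m-1$                $D_1, \ldots, D_{m-1}$
$2m$                                $C_2$
$m+1, \ldots, 2m-1$                 $D_1, \ldots, D_{m-1}$
$m$                                 $C_1$
$1, \ldots, m-1$                    $D_1, \ldots, D_{m-1}$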



The detailed program follows. $D_0$ is used for convenience to store the value of $C_i$ at the
beginning of each iteration of the for-loop with respect to i.

Program CONDITIONAL_ADD'_k
quantum registers:

X : n-bit arithmetic register
B : bit register

$C_1, C_2, \ldots, C_{m-1}$ : bit registers (initialized and finalized to 0)
$D_0, D_1, \ldots, D_{m-1}$ : bit registers (initialized and finalized to 0)

for $i = 0$ to $m - 2$ do
    if $i > 0$ then $D_0 \leftarrow D_0 \oplus C_i$
    for $j = 1$ to $m - 1$ do
        $D_j \Leftarrow D_j \oplus \mathrm{MAJ}(k_{im+j-1}, X_{im+j-1}, D_{j-1})$
    $C_{i+1} \Leftarrow C_{i+1} \oplus \mathrm{MAJ}(k_{im+m-1}, X_{im+m-1}, D_{m-1})$
    for $j = m - 1$ down to $1$ do
        $D_j \Leftarrow_R D_j \oplus \mathrm{MAJ}(k_{im+j-1}, X_{im+j-1}, D_{j-1})$
    if $i > 0$ then $D_0 \leftarrow D_0 \oplus C_i$
$D_0 \leftarrow D_0 \oplus C_{m-1}$
for $j = 1$ to $m - 1$ do
    $D_j \Leftarrow D_j \oplus \mathrm{MAJ}(k_{(m-1)m+j-1}, X_{(m-1)m+j-1}, D_{j-1})$
for $j = m - 1$ down to $1$ do
    $X_{(m-1)m+j} \leftarrow X_{(m-1)m+j} \oplus (k_{(m-1)m+j} \wedge B)$
    $X_{(m-1)m+j} \leftarrow X_{(m-1)m+j} \oplus (D_j \wedge B)$
    $D_j \Leftarrow_R D_j \oplus \mathrm{MAJ}(k_{(m-1)m+j-1}, X_{(m-1)m+j-1}, D_{j-1})$
$D_0 \leftarrow D_0 \oplus C_{m-1}$
for $i = m - 2$ down to $0$ do
    if $i > 0$ then $D_0 \leftarrow D_0 \oplus C_i$
    for $j = 1$ to $m - 1$ do
        $D_j \Leftarrow D_j \oplus \mathrm{MAJ}(k_{im+j-1}, X_{im+j-1}, D_{j-1})$
    $X_{im+m} \leftarrow X_{im+m} \oplus (k_{im+m} \wedge B)$
    $X_{im+m} \leftarrow X_{im+m} \oplus (C_{i+1} \wedge B)$
    $C_{i+1} \Leftarrow_R C_{i+1} \oplus \mathrm{MAJ}(k_{im+m-1}, X_{im+m-1}, D_{m-1})$
    for $j = m - 1$ down to $1$ do
        $X_{im+j} \leftarrow X_{im+j} \oplus (k_{im+j} \wedge B)$
        $X_{im+j} \leftarrow X_{im+j} \oplus (D_j \wedge B)$
        $D_j \Leftarrow_R D_j \oplus \mathrm{MAJ}(k_{im+j-1}, X_{im+j-1}, D_{j-1})$
    if $i > 0$ then $D_0 \leftarrow D_0 \oplus C_i$
$X_0 \leftarrow X_0 \oplus (k_0 \wedge B)$

The above program uses $n + O(\sqrt{n})$ registers in total and runs in $6n + O(\sqrt{n})$ steps (compared
to $2n + O(\log n)$ registers in total and $4n + O(1)$ steps for CONDITIONAL_ADD_k).
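The checkpointing scheme can be exercised on classical bits to confirm that it adds correctly and leaves every auxiliary register at 0. The Python sketch below is an illustration under assumptions, not a definitive transcription: `conditional_add_sqrt`, `carry`, and `add_bit` are hypothetical names, the list entry `C[0]` is an unused placeholder, and the lowest bit is assumed to be conditioned on B like the others.

```python
def maj(i, s, t):
    """MAJ(I, S, T): S AND T if I == 0, S OR T if I == 1."""
    return (s | t) if i else (s & t)

def conditional_add_sqrt(x, k, b, m):
    """Simulate the checkpointed conditional adder for n = m*m bits."""
    n = m * m
    X = [(x >> t) & 1 for t in range(n)]
    K = [(k >> t) & 1 for t in range(n)]
    C = [0] * m        # checkpoints C[1..m-1]; C[0] is an unused placeholder
    D = [0] * m        # D[0]: carry into the block, D[1..m-1]: scratch

    def carry(i, j):   # carry into bit i*m + j, from the bit just below it
        return maj(K[i * m + j - 1], X[i * m + j - 1], D[j - 1])

    def add_bit(t, c): # X[t] <- X[t] xor (k_t and B) xor (c and B)
        X[t] ^= K[t] & b
        X[t] ^= c & b

    for i in range(m - 1):               # first pass: record checkpoints only
        if i > 0:
            D[0] ^= C[i]
        for j in range(1, m):
            D[j] ^= carry(i, j)
        C[i + 1] ^= carry(i, m)
        for j in range(m - 1, 0, -1):
            D[j] ^= carry(i, j)
        if i > 0:
            D[0] ^= C[i]

    top = m - 1                          # highest block: add directly
    D[0] ^= C[top]
    for j in range(1, m):
        D[j] ^= carry(top, j)
    for j in range(m - 1, 0, -1):
        add_bit(top * m + j, D[j])
        D[j] ^= carry(top, j)
    D[0] ^= C[top]

    for i in range(m - 2, -1, -1):       # second pass: add and clear checkpoints
        if i > 0:
            D[0] ^= C[i]
        for j in range(1, m):
            D[j] ^= carry(i, j)
        add_bit(i * m + m, C[i + 1])
        C[i + 1] ^= carry(i, m)
        for j in range(m - 1, 0, -1):
            add_bit(i * m + j, D[j])
            D[j] ^= carry(i, j)
        if i > 0:
            D[0] ^= C[i]

    X[0] ^= K[0] & b                     # assumed: conditioned on b
    assert not any(C) and not any(D), "auxiliary bits must return to 0"
    return sum(bit << t for t, bit in enumerate(X))
```

The sketch recomputes each block of carries twice (once to record its checkpoint, once to use it), which is where the increase from roughly 4n to roughly 6n steps comes from.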

There also exist more space-efficient versions of TEST_EQUALITY_TO_k and
TEST_GREATER_THAN_k. For the former case, the program is as follows (where
again $n = m^2$).

Program TEST_EQUALITY_TO'_k
quantum registers:

X : n-bit arithmetic register
B : bit register

$C_0, C_1, \ldots, C_{m-1}$ : bit registers (initialized and finalized to 0)
$D_1, D_2, \ldots, D_m$ : bit registers (initialized and finalized to 0)

for $i = m - 1$ down to $0$ do
    if $i = m - 1$ then $D_m \leftarrow D_m \oplus 1$ else $D_m \leftarrow D_m \oplus C_{i+1}$
    for $j = m - 1$ down to $1$ do
        $D_j \Leftarrow D_j \oplus (D_{j+1} \wedge (X_{im+j} = k_{im+j}))$
    $C_i \Leftarrow C_i \oplus (D_1 \wedge (X_{im} = k_{im}))$
    for $j = 1$ to $m - 1$ do
        $D_j \Leftarrow_R D_j \oplus (D_{j+1} \wedge (X_{im+j} = k_{im+j}))$
    if $i = m - 1$ then $D_m \leftarrow D_m \oplus 1$ else $D_m \leftarrow D_m \oplus C_{i+1}$

$B \leftarrow B \oplus C_0$

for $i = 0$ to $m - 1$ do
    if $i = m - 1$ then $D_m \leftarrow D_m \oplus 1$ else $D_m \leftarrow D_m \oplus C_{i+1}$
    for $j = m - 1$ down to $1$ do
        $D_j \Leftarrow D_j \oplus (D_{j+1} \wedge (X_{im+j} = k_{im+j}))$
    $C_i \Leftarrow_R C_i \oplus (D_1 \wedge (X_{im} = k_{im}))$
    for $j = 1$ to $m - 1$ do
        $D_j \Leftarrow_R D_j \oplus (D_{j+1} \wedge (X_{im+j} = k_{im+j}))$
    if $i = m - 1$ then $D_m \leftarrow D_m \oplus 1$ else $D_m \leftarrow D_m \oplus C_{i+1}$

The above program uses $n + O(\sqrt{n})$ registers in total and runs in $4n + O(\sqrt{n})$ steps (compared
to $2n + O(1)$ registers in total and $2n + O(1)$ steps for TEST_EQUALITY_TO_k).

The program TEST_GREATER_THAN'_k is similar to the above, attaining $5n + O(\sqrt{n})$ time
and $n + O(\sqrt{n})$ space (vs. $3n + O(1)$ time and $2n + O(1)$ space).



APPENDIX B: PHASE FREEDOM IN IMPLEMENTATION OF REVERSIBLE 

ROUTINES 



Here we will explain ways in which the quantum phase can be treated in the essentially 
classical reversible routines which we have been discussing throughout this paper. In the 
language of quantum logic gates, the bit-level logic statements used in the programs here are 
represented by unitary matrices applied to the quantum wavefunction of all the registers. 
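These unitary matrices have a special restriction which makes them "classical", which is that
the matrix elements are only zero or one; this means that every definite computational state
$|x\rangle$ is taken to another definite computational state $|f(x)\rangle$, and not to a superposition of
states. To give an example, the Toffoli gate, the three-bit implementation of the AND gate
in reversible logic, involves the following unitary matrix:

$$
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \qquad \text{(B1)}
$$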



Here we consider the question of whether elementary operations with modified phases
(i.e., in which the matrix elements are unimodular complex numbers $e^{i\theta}$, rather than being
1) could be used in the implementation of the Schumacher function. We are motivated to
investigate this because we found in our previous study [10] that the implementation of a
modified Toffoli gate with a single non-zero phase

$$
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \qquad \text{(B2)}
$$

requires fewer resources in the following sense: we showed that the zero-phase Toffoli gate
(Eq. (B1)) can be implemented with 8 two-bit XOR gates and 8 one-bit gates, while the
Toffoli gate with modified phases (Eq. (B2)) requires only 3 XOR's and 4 one-bit gates. We
will establish here that the less-costly gate can in fact be used for most of the Toffoli gates,
and related three-bit operations, that are used in the implementation of the Schumacher
compression function.

Note that it is necessary that the complete Schumacher calculation be carried out with 
all the quantum phases equal to zero, in order that the superposition states discussed in 
Section | maintain the correct phase relation to one another. Thus the question becomes: 
how can the effect of the non-zero phase in Eq. (B2), if it is introduced in one Toffoli gate, be
undone at some later step of the calculation? The answer (which we will establish shortly)
is the obvious one: many of the reversible routines which we have introduced (although not
the high-level Schumacher program itself) have a palindromic character, so that a Toffoli
gate on three bits is exactly "undone" at a later stage of the computation, roughly as far
from the end of the subroutine as the original gate is from the beginning. It turns out that
the effect of the $-1$ phase factor can be precisely undone at the second occurrence of the
gate, too.

[Figure: $f$ acts on bits 1 to n, then $g$ acts, then $f^{-1}$ acts on bits 1 to n; the remaining bits are bits n + 1 to m.]

We will now establish the desired basic result using the setup of the figure, that the
boolean function $f$ can be implemented with any arbitrary phase factors, so long as they
also appear in $f^{-1}$, no matter what the intervening boolean function $g$ is, so long as $g$ does
not modify the values of the bits on which $f$ and $f^{-1}$ act. By applying this result repeatedly
to the subroutines which we have introduced, starting at the innermost level, we deduce
all the three-bit primitives which can be implemented with non-zero phase. Assignments in
which these non-zero phases are permitted have been identified by the special assignment
symbol

$\Leftarrow$ (B3)

These statements are always paired with others, denoted by

$\Leftarrow_R$ (B4)

in which the reverse phases are implemented. (For the phases in Eq. (B2), the operation
is self- inverse.) In one case, the pairing is between statements in different, palindromically- 
arranged calls of the same subroutine; for these we have used a distinct symbol 



(B5)

After establishing the basic result, we will review a few of the details of this implementation
in the Schumacher function.

Let us first write down what the set of operations in the figure is supposed to do.
Beginning with the basis state

$$|x\rangle = |\underbrace{x_1 x_2 \ldots x_n \ldots x_m}_{m\ \text{bits}}\rangle \qquad \text{(B6)}$$

at time $t_1$, it becomes at $t_2$, after the operation of $f$,

$$|x'\rangle = |\underbrace{f(x_1 x_2 \ldots x_n)}_{n\ \text{bits}}\, x_{n+1} \ldots x_m\rangle \qquad \text{(B7)}$$

Then at time $t_3$ the state is

$$\exp(i\theta_g(x'))\, |\underbrace{f(x_1 x_2 \ldots x_n)}_{n\ \text{bits}}\, \underbrace{g(x')}_{m-n\ \text{bits}}\rangle. \qquad \text{(B8)}$$

$g(x')$ depends on the state of the entire m-bit register $x'$, but only modifies the last $m - n$
bits, as indicated. Note that we allow for the possibility that $g$ itself is a modified boolean
function with non-zero phases. This is necessary because we will apply this result in a nested
fashion in the Schumacher subroutines. Finally at time $t_4$ the state is

$$\exp(i\theta_g(x'))\, |x_1 x_2 \ldots x_n\, g(x')\rangle. \qquad \text{(B9)}$$

That is, the first n bits are restored to their original state, and bits n + 1 through m remain
in the state $g(x')$.

Now, the question is, will the state Eq. (B9) still result if the function $f$ is modified to
introduce non-zero phases $\theta_f(x_1 \ldots x_n)$? If we establish that this is true for all boolean inputs
$|x\rangle$, this will suffice to prove that these networks have the same action on any arbitrary
quantum states (this follows directly from the linear superposition principle of quantum
mechanics). We follow the time evolution as before with the modified $f$. At time $t_2$ the
state is

$$\exp(i\theta_f(x_1 \ldots x_n))\, |f(x_1 x_2 \ldots x_n)\, x_{n+1} \ldots x_m\rangle \qquad \text{(B10)}$$

Then at time $t_3$:

$$\exp(i(\theta_g(x') + \theta_f(x_1 \ldots x_n)))\, |f(x_1 x_2 \ldots x_n)\, g(x')\rangle \qquad \text{(B11)}$$

and finally at $t_4$:

$$\exp(i(\theta_f(x_1 \ldots x_n) + \theta_g(x') + \theta_{f^{-1}}(f(x_1 \ldots x_n))))\, |x_1 x_2 \ldots x_n\, g(x')\rangle \qquad \text{(B12)}$$

The final term in the phase factor can be simplified. Recall that the unitary transformation
corresponding to $f^{-1}$ is the transpose of the complex conjugate of the unitary transformation
corresponding to $f$. (This follows directly from the definition of unitarity.) Therefore, to get
$\theta_{f^{-1}}$ from $\theta_f$, we flip the sign (this is the complex conjugation), and we make the argument of
the $\theta$ function the output values of the bits rather than the input values (this is the transpose).
Here we use the fact that $g$ does not modify the first n bits; their output values are the
same as the original inputs $x_1 x_2 \ldots x_n$. Rendering this in mathematical language:

$$\theta_{f^{-1}}(f(x_1 \ldots x_n)) = -\theta_f(x_1 \ldots x_n) \qquad \text{(B13)}$$

Thus, the two $\theta_f$ terms in the phase in Eq. (B12) cancel out, and Eq. (B12) becomes
identical to Eq. (B9), which is the desired result. □
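The cancellation argument can be checked numerically on a small example. The Python sketch below is an illustration under assumptions: it uses four bits (a, b, c, d), takes $f$ to be the phase-modified Toffoli gate of Eq. (B2) acting on (a, b, c) with the $-1$ assumed to sit on the basis state a = 1, b = 0, c = 0, and takes $g$ to be a hypothetical gate that reads c but modifies only d and carries its own arbitrary phase. All function names are hypothetical.

```python
import cmath
import random

def phased_toffoli(state):
    """c ^= a & b, with a -1 phase on inputs (a, b, c) = (1, 0, 0)."""
    out = {}
    for (a, b, c, d), amp in state.items():
        phase = -1 if (a, b, c) == (1, 0, 0) else 1
        out[(a, b, c ^ (a & b), d)] = phase * amp
    return out

def plain_toffoli(state):
    """Zero-phase Toffoli gate: c ^= a & b."""
    return {(a, b, c ^ (a & b), d): amp for (a, b, c, d), amp in state.items()}

def g_with_phase(state, theta=0.7):
    """Intervening g: d ^= c, with an arbitrary phase when c = 1.

    g reads c but modifies only d, so it leaves the bits of f untouched.
    """
    out = {}
    for (a, b, c, d), amp in state.items():
        out[(a, b, c, d ^ c)] = cmath.exp(1j * theta * c) * amp
    return out

def random_state(seed=0):
    """Random (unnormalized) superposition over all 16 basis states."""
    rng = random.Random(seed)
    basis = [(a, b, c, d) for a in (0, 1) for b in (0, 1)
             for c in (0, 1) for d in (0, 1)]
    return {s: complex(rng.gauss(0, 1), rng.gauss(0, 1)) for s in basis}

psi = random_state()
# f, g, f^{-1}: the gate of Eq. (B2) is its own inverse.
with_phases = phased_toffoli(g_with_phase(phased_toffoli(psi)))
reference = plain_toffoli(g_with_phase(plain_toffoli(psi)))
max_diff = max(abs(with_phases[s] - reference[s]) for s in reference)
```

Because every gate here permutes basis states and multiplies by a phase, comparing the two results on one generic superposition is a reasonable spot check; `max_diff` should vanish up to rounding, while a single `phased_toffoli` on its own does differ from `plain_toffoli`.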

Finally, we briefly review the application of this result to the programs introduced in the
text. The first appearance is in CONDITIONAL_ADD_k, where the role of $f$ is played
in the innermost part of the program by the assignment statement

$C_{n-1} \Leftarrow C_{n-1} \oplus \mathrm{MAJ}(k_{n-2}, X_{n-2}, C_{n-2})$

This is a three-bit operation of the Toffoli type (or a trivial modification of it) involving the
bits $C_{n-1}$, $C_{n-2}$, and $X_{n-2}$. $f^{-1}$ occurs a short distance down,

$C_{n-1} \Leftarrow_R C_{n-1} \oplus \mathrm{MAJ}(k_{n-2}, X_{n-2}, C_{n-2})$

The role of $g$ is played by the two statements

$X_{n-1} \leftarrow X_{n-1} \oplus (k_{n-1} \wedge B)$
$X_{n-1} \leftarrow X_{n-1} \oplus (C_{n-1} \wedge B)$

Obviously, only $X_{n-1}$ is modified by $g$, so the condition that $g$ modify only bits not touched
by $f$ is satisfied; so, we are allowed to introduce a phase-modified $f$ as indicated by the $\Leftarrow$
and $\Leftarrow_R$ assignments. Moving away from the innermost part of the program, we see that all
the above is nested inside a larger $g$ in which $C_{n-1}$, $X_{n-2}$, and $X_{n-1}$ are modified, surrounded
by an $f$, $f^{-1}$ pair involving the bits $C_{n-2}$, $X_{n-3}$, and $C_{n-3}$; working outward in succession
this way, we conclude that all the $C_i$ assignments may be replaced with phase-modified $\Leftarrow$
and $\Leftarrow_R$ assignments.

In Section IV B we exhibit a pair of statements

$B \Leftarrow B \oplus (Y > l)$
$B \Leftarrow_R B \oplus (Y > l)$

playing the role of $f$ and $f^{-1}$. These are not primitive three-bit operations as in
the earlier examples, but they are themselves implemented with bit-level programs
(TEST_GREATER_THAN_k). For this $f^{-1}$, the $\Leftarrow_R$ assignment requires that the bit-level
routine be run in the time-reversed order. This can be done since classically, this
boolean function is its own inverse. In the time-inverted TEST_GREATER_THAN_k,
the $\Leftarrow$'s and $\Leftarrow_R$'s should be interchanged. The B assignment involving the symbol of
Eq. (B5) in this routine is special, in that it is paired with the same statement in the
time-reversed call to TEST_GREATER_THAN_k. This special symbol is a reminder that this
statement should be implemented with the phases corresponding to the $\Leftarrow$ assignment in the
first call to the program, and with those corresponding to the $\Leftarrow_R$ assignment in the second call.

We have not indicated phase-modifying assignments for any of the two-bit gate level operations
in these programs. We take as given that these two-bit gates could be implemented
with zero phases. But if this were not the case, then many of these paired assignments may
be phase-modified in exactly the way we have shown for the three-bit primitives.